AITopics | statistical accuracy

statistical accuracy

Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy

Aryan Mokhtari, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann, Alejandro Ribeiro

Neural Information Processing Systems, May-1-2026, 06:06:28 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, statistical accuracy, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.47)
North America > Canada > Quebec (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Appendix details of the proposed method

Neural Information Processing Systems, Apr-25-2026, 01:02:24 GMT

In this section, we provide further intuition about the proposed AdaQN method. As shown in Figure 5, AdaQN must ensure that the approximate solution $w_m$ of the ERM problem with $m$ samples lies within the superlinear convergence neighborhood of BFGS for the ERM problem with $n = 2m$ samples. Here, $w_m^*$ and $w_n^*$ are the optimal solutions of the risks $R_m$ and $R_n$ corresponding to the sets $S_m$ and $S_n$ with $m$ and $n$ samples, respectively, where $S_m \subset S_n$. The statistical accuracy region of $R_m$ is denoted by a blue circle, the statistical accuracy region of $R_n$ is denoted by a red circle, and the superlinear convergence neighborhood of BFGS for $R_n$ is denoted by a dotted purple circle. As we observe, any point within the statistical accuracy of $w_m^*$ is within the superlinear convergence neighborhood of BFGS for $R_n$.

artificial intelligence, machine learning, statistical accuracy, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

1f9b616faddedc02339603f3b37d196c-Paper.pdf

Neural Information Processing Systems, Apr-25-2026, 01:02:21 GMT

artificial intelligence, machine learning, statistical accuracy, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.28)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning

Neural Information Processing Systems, Mar-17-2026, 08:28:38 GMT

We consider the weakly supervised binary classification problem where the labels are randomly flipped with probability $1-\alpha$. Although there exist numerous algorithms for this problem, it remains theoretically unexplored how the statistical accuracies and computational efficiency of these algorithms depend on the degree of supervision, which is quantified by $\alpha$. In this paper, we characterize the effect of $\alpha$ by establishing the information-theoretic and computational boundaries, namely, the minimax-optimal statistical accuracy that can be achieved by all algorithms, and polynomial-time algorithms under an oracle computational model. For small $\alpha$, our result shows a gap between these two boundaries, which represents the computational price of achieving the information-theoretic boundary due to the lack of supervision. Interestingly, we also show that this gap narrows as $\alpha$ increases. In other words, having more supervision, i.e., more correct labels, not only improves the optimal statistical accuracy as expected, but also enhances the computational efficiency for achieving such accuracy.
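The label-noise model in the abstract above can be illustrated with a short simulation. This is only a sketch under stated assumptions: labels take values in {-1, +1}, each label is flipped independently with probability $1-\alpha$, and `flip_labels` and `agreement` are hypothetical helper names that do not come from the paper.

```python
import random


def flip_labels(labels, alpha, seed=0):
    """Flip each +/-1 label independently with probability 1 - alpha.

    alpha is the degree of supervision: alpha = 1 leaves every label
    intact, while smaller alpha corrupts more labels.
    """
    rng = random.Random(seed)
    return [-y if rng.random() < 1.0 - alpha else y for y in labels]


def agreement(clean, noisy):
    """Fraction of positions where the noisy label equals the clean one."""
    return sum(1 for a, b in zip(clean, noisy) if a == b) / len(clean)


if __name__ == "__main__":
    rng = random.Random(1)
    clean = [rng.choice([-1, 1]) for _ in range(10_000)]
    for alpha in (0.6, 0.8, 1.0):
        noisy = flip_labels(clean, alpha)
        print(alpha, round(agreement(clean, noisy), 3))
```

Under this model the expected fraction of correct labels equals $\alpha$, which is the sense in which larger $\alpha$ means more supervision.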

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

1f9b616faddedc02339603f3b37d196c-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 18:54:56 GMT

erm problem, quasi-newton method, statistical accuracy, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)

First-Order Adaptive Sample Size Methods to Reduce Complexity of Empirical Risk Minimization

Neural Information Processing Systems, Nov-21-2025, 15:43:53 GMT

This paper studies empirical risk minimization (ERM) problems for large-scale datasets and incorporates the idea of adaptive sample size methods to improve the guaranteed convergence bounds for first-order stochastic and deterministic methods. In contrast to traditional methods that attempt to solve the ERM problem corresponding to the full dataset directly, adaptive sample size schemes start with a small number of samples and solve the corresponding ERM problem to its statistical accuracy. The sample size is then grown geometrically (e.g., scaled by a factor of two), and the solution of the previous ERM problem is used as a warm start for the new one. Theoretical analyses show that the use of adaptive sample size methods reduces the overall computational cost of achieving the statistical accuracy of the whole dataset for a broad range of deterministic and stochastic first-order methods. The gains are specific to the choice of method. When particularized to, e.g., accelerated gradient descent and stochastic variance reduced gradient, the computational cost advantage is a logarithm of the number of training samples. Numerical experiments on various datasets confirm the theoretical claims and showcase the gains of using the proposed adaptive sample size scheme.
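The warm-start scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes ridge-regularized least squares on synthetic data, plain gradient descent as the inner solver, a statistical accuracy proxy V_m = 1/m, and a regularizer c * V_m. The names `make_data`, `grad`, and `adaptive_sample_size_gd` are hypothetical. The stopping rule relies on the standard bound f(w) - f* <= ||grad f(w)||^2 / (2 mu) for a mu-strongly convex f.

```python
import random


def make_data(n, seed=0):
    """Synthetic regression data: y = <w_true, x> + small Gaussian noise."""
    rng = random.Random(seed)
    w_true = [1.0, -2.0]
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        y = sum(wi * xi for wi, xi in zip(w_true, x)) + 0.1 * rng.gauss(0, 1)
        data.append((x, y))
    return data


def grad(w, samples, reg):
    """Gradient of (1/2m) * sum (<w,x> - y)^2 + (reg/2) * ||w||^2."""
    m = len(samples)
    g = [reg * wi for wi in w]
    for x, y in samples:
        r = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += r * xj / m
    return g


def adaptive_sample_size_gd(data, m0=8, step=0.1, c=1.0, max_iter=10_000):
    """Solve each subproblem to accuracy V_m, then double the sample size."""
    n = len(data)
    w = [0.0] * len(data[0][0])
    m = min(m0, n)
    while True:
        subset = data[:m]
        v_m = 1.0 / m      # assumed statistical accuracy of m samples
        reg = c * v_m      # regularizer, also a strong convexity constant
        for _ in range(max_iter):
            g = grad(w, subset, reg)
            # ||g||^2 <= 2 * reg * V_m implies suboptimality <= V_m.
            if sum(gj * gj for gj in g) <= 2.0 * reg * v_m:
                break
            w = [wj - step * gj for wj, gj in zip(w, g)]
        if m == n:
            return w
        m = min(2 * m, n)  # grow geometrically, keep w as the warm start


if __name__ == "__main__":
    print(adaptive_sample_size_gd(make_data(1024)))
```

Early stages are cheap because they touch few samples and only need a loose tolerance. The full dataset is used only in the last stage, starting from a point that is already close to its solution.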

empirical risk minimization, first-order adaptive sample size method, reduce complexity, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy

Neural Information Processing Systems, Nov-21-2025, 15:07:50 GMT

We consider empirical risk minimization for large-scale datasets. We introduce Ada Newton as an adaptive algorithm that uses Newton's method with adaptive sample sizes. The main idea of Ada Newton is to increase the size of the training set by a factor larger than one in a way that the minimization variable for the current training set is in the local neighborhood of the optimal argument of the next training set. This allows us to exploit the quadratic convergence property of Newton's method and reach the statistical accuracy of each training set with only one iteration of Newton's method. We show theoretically that we can iteratively increase the sample size while applying single Newton iterations without line search and staying within the statistical accuracy of the regularized empirical risk. In particular, we can double the size of the training set in each iteration when the number of samples is sufficiently large. Numerical experiments on various datasets confirm the possibility of increasing the sample size by a factor of 2 at each iteration, which implies that Ada Newton achieves the statistical accuracy of the full training set with about two passes over the dataset.
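The idea of taking a single Newton step per sample-size doubling can be sketched as below. This is an illustrative toy, not the paper's algorithm as published. It assumes two-dimensional regularized logistic regression on synthetic data, a regularizer c/m, a fixed growth factor of 2, and a crude initialization (one Newton step from zero on the smallest subset) in place of the paper's initial condition. The names `make_data`, `grad_and_hessian`, and `ada_newton_sketch` are hypothetical.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def make_data(n, seed=0):
    """Synthetic 2-D data with +/-1 labels drawn from a logistic model."""
    rng = random.Random(seed)
    w_true = (1.0, -2.0)
    data = []
    for _ in range(n):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        p = sigmoid(w_true[0] * x[0] + w_true[1] * x[1])
        data.append((x, 1.0 if rng.random() < p else -1.0))
    return data


def grad_and_hessian(w, samples, reg):
    """Gradient and Hessian of mean log(1 + exp(-y<w,x>)) + (reg/2)||w||^2."""
    m = len(samples)
    g = [reg * w[0], reg * w[1]]
    h00, h01, h11 = reg, 0.0, reg
    for (x0, x1), y in samples:
        s = sigmoid(y * (w[0] * x0 + w[1] * x1))
        coef = -(1.0 - s) * y / m   # derivative of the loss w.r.t. <w,x>
        g[0] += coef * x0
        g[1] += coef * x1
        curv = s * (1.0 - s) / m    # second derivative w.r.t. <w,x>
        h00 += curv * x0 * x0
        h01 += curv * x0 * x1
        h11 += curv * x1 * x1
    return g, (h00, h01, h11)


def ada_newton_sketch(data, m0=16, c=1.0):
    """One Newton step (no line search) per stage, doubling m each stage."""
    n = len(data)
    w = [0.0, 0.0]
    m = min(m0, n)
    while True:
        g, (h00, h01, h11) = grad_and_hessian(w, data[:m], c / m)
        det = h00 * h11 - h01 * h01  # positive because reg > 0
        # w <- w - H^{-1} g, using the closed-form inverse of a 2x2 matrix.
        w = [w[0] - (h11 * g[0] - h01 * g[1]) / det,
             w[1] - (-h01 * g[0] + h00 * g[1]) / det]
        if m == n:
            return w
        m = min(2 * m, n)


if __name__ == "__main__":
    print(ada_newton_sketch(make_data(1024)))
```

Because the sample sizes form a geometric series, the total number of samples processed across all stages is roughly twice the full dataset size, which matches the "about two passes" figure quoted in the abstract.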

adaptive newton method, empirical risk minimization, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning

Neural Information Processing Systems, Nov-21-2025, 14:43:54 GMT

statistical-computational tradeoff, supervision, weakly supervised learning, (7 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

First-Order Adaptive Sample Size Methods to Reduce Complexity of Empirical Risk Minimization

Aryan Mokhtari, Alejandro Ribeiro

Neural Information Processing Systems, Nov-21-2025, 11:27:27 GMT

This paper studies empirical risk minimization (ERM) problems for large-scale datasets and incorporates the idea of adaptive sample size methods to improve the guaranteed convergence bounds for first-order stochastic and deterministic methods. In contrast to traditional methods that attempt to solve the ERM problem corresponding to the full dataset directly, adaptive sample size schemes start with a small number of samples and solve the corresponding ERM problem to its statistical accuracy. The sample size is then grown geometrically (e.g., scaled by a factor of two), and the solution of the previous ERM problem is used as a warm start for the new one. Theoretical analyses show that the use of adaptive sample size methods reduces the overall computational cost of achieving the statistical accuracy of the whole dataset for a broad range of deterministic and stochastic first-order methods. The gains are specific to the choice of method. When particularized to, e.g., accelerated gradient descent and stochastic variance reduced gradient, the computational cost advantage is a logarithm of the number of training samples. Numerical experiments on various datasets confirm the theoretical claims and showcase the gains of using the proposed adaptive sample size scheme.

artificial intelligence, machine learning, statistical accuracy, (14 more...)

Neural Information Processing Systems

Country: